Zero-Day Vulnerability Investigation – MSHTML Challenge (CVE-2021-40444)

Challenge Description

This LetsDefend challenge is based on the Microsoft MSHTML zero-day vulnerability that was found in 2021. According to Microsoft, this vulnerability involved a hacker tricking someone into opening a dangerous Office document, like a Word file, which secretly contained web-like features powered by Microsoft’s built-in browser engine (MSHTML). When opened, the file activates a hidden tool called an ActiveX control, a small executable piece of code, which could then take control of the computer and do harmful things like installing malware or stealing data, often without the user even noticing.

For this challenge, we have to analyze four Word documents and answer the questions one at a time.

The challenge can be found here.

Tools available

XLMMacroDeobfuscator-master.zip
numbers-to-string.py
Tools/oledump_V0_0_60.zip
oletools-master.zip
re-search.py
reextra.py
reextra.pyc
xmldump.py
zipdump.py

Challenge Walkthrough

Question 1: Examining the Employees_Contact_Audit_Oct_2021.docx file, what is the malicious IP in the docx file?

A .docx file uses the Office Open XML (OOXML) format, which is essentially a compressed ZIP archive. This archive typically contains various XML files that define the document's structure, content, and properties, and may also include media files such as images or videos. To begin our analysis, we will use the zipdump.py tool to examine the internal structure of this ZIP archive, looking for embedded objects, suspicious files, or other anomalies that warrant further investigation.

The following command reports the content of the specified ZIP file:

python3 zipdump.py ~/Desktop/ChallengeFiles/Employees_Contact_Audit_Oct_2021.docx

The zipdump.py output for Employees_Contact_Audit_Oct_2021.docx appears standard for a .docx document. It lacks immediate malicious indicators such as VBA macros (vbaProject.bin or word/vbaData.bin), embedded executables (.exe, .dll), script files (.ps1, .bat, .vbs, .js), or unusually named files. However, this doesn't rule out hidden malicious external links or objects. Therefore, our next step is to analyze the document's XML files for such links, focusing on word/_rels/document.xml.rels, which defines the document's relationships, including external ones.

We will use the following command, which extracts the raw content of stream 9 (word/_rels/document.xml.rels) from the compressed .docx file and pipes that XML to xmldump.py for pretty-printed (indented, human-readable) display:
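The same kind of listing can be sketched with Python's standard zipfile module. This is a minimal illustration, not a replacement for zipdump.py: the sample archive is a hypothetical stand-in built in memory, the 1-based index is assumed to mirror the tool's numbering, and the suffix list is only a rough heuristic for members worth a closer look.

```python
import io
import zipfile

# Heuristic suffixes that deserve a closer look inside an OOXML package.
SUSPICIOUS_SUFFIXES = (".bin", ".exe", ".dll", ".ps1", ".bat", ".vbs", ".js")


def list_members(docx_bytes):
    """Return (index, name, flagged) for every member of the archive."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for index, name in enumerate(archive.namelist(), start=1):
            flagged = name.lower().endswith(SUSPICIOUS_SUFFIXES)
            rows.append((index, name, flagged))
    return rows


def build_sample():
    """Build a tiny stand-in archive in memory (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document/>")
        archive.writestr("word/_rels/document.xml.rels", "<Relationships/>")
    return buffer.getvalue()


if __name__ == "__main__":
    for index, name, flagged in list_members(build_sample()):
        print(index, name, "<- check" if flagged else "")
```

To run it against a real file, read the document's bytes with open(path, "rb").read() and pass them to list_members.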

python3 zipdump.py -s 9 -d ~/Desktop/ChallengeFiles/Employees_Contact_Audit_Oct_2021.docx | python3 xmldump.py pretty

This XML output from word/_rels/document.xml.rels, which represents the document's relationships, shows a highly suspicious external link in the Relationship with Id="rId5". Its Type is oleObject and its Target is an mhtml URI pointing to the IP address 175.24.190.249, attempting to retrieve note.html. This strongly suggests an attempt to exploit a vulnerability to fetch and potentially execute content from a remote server.

Additionally, the XML output from word/document.xml (index 10) confirms the previous observation: it contains a linked OLE (Object Linking and Embedding) object of type htmlfile with r:id="rId5", which strongly indicates a potential external connection or exploit attempt. The other relationships don't indicate malicious activity for the .docx file.

To confirm our observation and verify the reputation of this suspicious IP address, we'll consult Threat Intelligence platforms such as VirusTotal and AbuseIPDB. We'll use both for a more complete analysis.

As shown by the security vendors' analysis in VirusTotal, the IP address 175.24.190.249 is flagged as "Malicious", which indicates that their own intelligence and detection engines have identified it as a threat. Additionally, the "RELATIONS" tab lists Employees_Contact_Audit_Oct_2021.docx among the files that communicate with this IP address, which confirms that the document contains an embedded link leading to it.
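The check described above, looking for relationships that point outside the package, can be sketched with the standard library. The XML below is a simplified, hypothetical stand-in for word/_rels/document.xml.rels: only the Id, the oleObject type, the IP address, and note.html come from the analysis, and the exact Target string in the real file may differ.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OOXML package relationship parts.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

# Simplified, hypothetical stand-in for word/_rels/document.xml.rels.
SAMPLE_RELS = f"""<Relationships xmlns="{REL_NS}">
  <Relationship Id="rId1" Type="{OFFICE_REL}/styles" Target="styles.xml"/>
  <Relationship Id="rId5" Type="{OFFICE_REL}/oleObject"
                Target="mhtml:http://175.24.190.249/note.html"
                TargetMode="External"/>
</Relationships>"""


def external_relationships(rels_xml):
    """Return (Id, short type, Target) for every external relationship."""
    root = ET.fromstring(rels_xml)
    found = []
    for rel in root.iter(f"{{{REL_NS}}}Relationship"):
        # Internal parts omit TargetMode; remote targets set it to "External".
        if rel.get("TargetMode") == "External":
            short_type = rel.get("Type", "").rsplit("/", 1)[-1]
            found.append((rel.get("Id"), short_type, rel.get("Target")))
    return found


if __name__ == "__main__":
    for rel_id, rel_type, target in external_relationships(SAMPLE_RELS):
        print(rel_id, rel_type, target)
```

Filtering on TargetMode="External" keeps the output short, since ordinary parts such as styles or fonts resolve inside the archive.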

We'll also take a look at AbuseIPDB:

Despite the malicious activity confirmed by our prior analysis and by VirusTotal, the IP has no reports on AbuseIPDB. This suggests its behavior was targeted or specialized, rather than the broad-spectrum abuse traffic that AbuseIPDB users typically report. That is consistent with a zero-day vulnerability, which is often exploited discreetly to evade detection. Furthermore, the IP geolocates to China while the lure document is written in English. Geolocation alone does not prove intent, but the mismatch adds weight to the assessment and may point toward involvement by a foreign adversary.

Answer to the question: 175.24.190.249

Question 2: Examining the Employee_W2_Form.docx file, what is the malicious domain in the docx file?

Since this file is also a .docx file, we need to extract its content to see which XML files it contains. We will use the same command to achieve this:

python3 zipdump.py ~/Desktop/ChallengeFiles/Employee_W2_Form.docx

Again, the zipdump.py output for Employee_W2_Form.docx appears standard for a .docx document. Looking at the file names in the archive, there is no indication of VBA macros, embedded executables, script files, or unusually named files.

However, just like with the previous file, this doesn't preclude hidden malicious external links or objects. Since we are looking for a domain name, our next step will be to analyze the document's XML files for such information. We will focus on word/_rels/document.xml.rels as it's designed to define external links.

The following command extracts the raw content of the thirteenth stream (index 13), which is word/_rels/document.xml.rels, from the compressed .docx file and then pipes that XML content to re-search.py to search that content for any common web addresses, domain names, or IP addresses:

python3 zipdump.py -s 13 -d ~/Desktop/ChallengeFiles/Employee_W2_Form.docx | python3 re-search.py "(http[s]?://[^\s\"']+|[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+|(?:\d{1,3}\.){3}\d{1,3})"

Here, the suspicious element is the domain arsenal.30cm.tw appearing twice in relation to oleObject and word.html. This suggests a potential OLE exploit attempting to load content from an external, non-standard domain within a Word document, which is a common tactic for malware delivery or command and control. The domain arsenal.30cm.tw should be further investigated with VirusTotal and ANY.RUN.
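The search pattern passed to re-search.py above can also be tried directly in Python. This is a rough sketch: the pattern is the same apart from a reordered character class, and the sample text is a hypothetical fragment built around the domain and file name found in the analysis, not the real relationship part.

```python
import re

# Same alternatives as the command-line pattern: URLs, dotted names, IPv4.
INDICATOR_RE = re.compile(
    r"""(https?://[^\s"']+|[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+|(?:\d{1,3}\.){3}\d{1,3})"""
)

# Hypothetical fragment of a relationships part (Type attributes omitted).
SAMPLE_TEXT = (
    '<Relationship Id="rId1" Target="styles.xml"/>'
    '<Relationship Id="rId6" Target="mhtml:http://arsenal.30cm.tw/word.html" '
    'TargetMode="External"/>'
)


def find_indicators(text):
    """Return every URL, dotted name, or IPv4 address in order of appearance."""
    return INDICATOR_RE.findall(text)


if __name__ == "__main__":
    for hit in find_indicators(SAMPLE_TEXT):
        print(hit)
```

The pattern is deliberately broad, so ordinary part names such as styles.xml also match. The hits still need a manual pass to separate internal file names from remote hosts.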

The domain arsenal.30cm.tw is flagged as "Malicious" by multiple security vendors on VirusTotal. Furthermore, the "RELATIONS" tab reveals its presence in various other files, some of which attempt to connect to or embed the domain upon opening or execution. These findings strongly indicate the malicious nature of arsenal.30cm.tw. As a best practice, we will further investigate this domain in a sandboxed environment like ANY.RUN. This provides behavioral context that reputation lookups on VirusTotal might miss, offering a more complete view of how the domain interacts with a live system.

The lack of activity in ANY.RUN is attributed to the domain being unreachable or offline at the time of dynamic analysis, which does not negate its past or potential malicious nature. While the evidence found in VirusTotal is strong, it's always good practice in cybersecurity to gather as much intelligence as possible. We'll take a look at another Threat Intelligence platform called OTX.

This OTX output confirms that arsenal.30cm.tw is malicious: it states that the domain "hosted" and "communicated" with "1 malicious files", identified as "SLF:MamacseMacro.A" and confirmed to be a Trojan via its SHA-256 hash. Additionally, the "Open directory" finding points to an exposed directory listing that can be used to distribute malware, and the Passive DNS records link the domain to IP addresses known for malicious activity on DigitalOcean, a cloud hosting provider frequently abused by threat actors.

Answer to the question: arsenal.30cm.tw

Question 3: Examining the Work_From_Home_Survey.doc file, what is the malicious domain in the doc file?

In this case, we're analyzing a .doc file which, unlike a .docx, is a binary, uncompressed Word document based on the OLE (Object Linking and Embedding) format. To identify any embedded malicious domains, we’ll use OLE analysis tools such as oleid.py and oleobj.py from the oletools suite. These tools help detect macros, auto-execution triggers, and external object references that may reveal links to malicious domains.
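Because a file extension can be misleading, it can help to check which container format a document really uses before picking tools. The sketch below compares the first bytes against the well-known OLE2 and ZIP signatures. It is an optional sanity check under the assumption that the extension might not match the content, and it does not claim anything about what the challenge file contains.

```python
# Signature of an OLE2 compound file (legacy .doc, .xls, .ppt).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Signature of a ZIP local file header (OOXML packages such as .docx).
ZIP_MAGIC = b"PK\x03\x04"


def container_type(header):
    """Classify a document by its leading bytes."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def container_type_of_file(path):
    """Read just enough of the file to classify it."""
    with open(path, "rb") as handle:
        return container_type(handle.read(8))


if __name__ == "__main__":
    print(container_type(OLE2_MAGIC + b"\x00" * 8))
    print(container_type(ZIP_MAGIC + b"\x14\x00"))
```

A result of "zip" for a file named .doc would mean the document is an OOXML package, which the zipdump.py workflow from the earlier questions can also inspect.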

The following command performs a quick analysis of the .doc file to detect macros, external relationships, and other suspicious features like OLE objects or auto-execution:

python3 oleid.py ~/Desktop/ChallengeFiles/Work_From_Home_Survey.doc

The output from the oleid tool indicates the presence of 1 external relationship, which is suspicious because .doc files typically don't use external relationships like .docx files do. If oleid flags one, it may suggest an embedded object or link crafted to connect to a remote server, a common tactic in phishing or malware delivery.

To investigate further, we’ll use the oleobj tool to extract and analyze any embedded OLE objects that might be responsible for this external relationship.

The following command will attempt to extract embedded objects and reveal any URLs or remote references, helping us determine if the file is malicious:

python3 oleobj.py ~/Desktop/ChallengeFiles/Work_From_Home_Survey.doc

The output of the oleobj command reveals a suspicious URL pointing to the domain trendparlye.com, which raises red flags due to its unusual name and potential use in phishing or malware delivery. The output also indicates that the suspicious URL is potentially used as an exploit for CVE-2021-40444. To investigate further, we’ll check this domain on VirusTotal to see whether trendparlye.com has been flagged as malicious, linked to known malware campaigns, or found embedded in other suspicious files or URLs.

Multiple trusted security vendors classify the domain trendparlye.com as malicious or malware-related, reinforcing its threat profile. Additionally, in the "RELATIONS" tab, this domain is flagged as malicious, with 9 out of 94 engines detecting it as suspicious. Passive DNS records show it resolved to IP 99.83.154.118 in 2022 with detections, while an earlier resolution to 127.0.0.1 likely reflects sandbox behavior or malware redirection. It has been linked to five malicious files, including RAR, ZIP, and Office documents such as the one we are currently analyzing. This confirms its malicious nature and suggests its use in phishing or malware distribution campaigns involving fake surveys and criminal-themed lures.

Answer to the question: trendparlye.com

Question 4: Examining the income_tax_and_benefit_return_2021.docx file, what is the malicious domain in the docx file?

Here, just like for Question #2, we are dealing with a .docx file and looking for a malicious domain. To do so, we need to extract the content of the income_tax_and_benefit_return_2021.docx file to see which XML objects it contains.

We will use the same commands as before to achieve this:

python3 zipdump.py ~/Desktop/ChallengeFiles/income_tax_and_benefit_return_2021.docx

Again, the zipdump.py output for income_tax_and_benefit_return_2021.docx appears standard for a docx document. However, just like previously, we will focus on word/_rels/document.xml.rels (index 19) as it's designed to define external links.

We will use the following command to search the content of the .docx file for any common web addresses, domain names, or IP addresses:

python3 zipdump.py -s 19 -d ~/Desktop/ChallengeFiles/income_tax_and_benefit_return_2021.docx | python3 re-search.py "(http[s]?://[^\s\"']+|[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+|(?:\d{1,3}\.){3}\d{1,3})"

The output indicates that the .docx file contains an embedded external relationship pointing to the domain hidusi.com, specifically to the URL http://hidusi.com/e8c76295a5f9acb7/side.html. Again, this is suspicious because such external links in Office documents are often used to load malicious content or track user activity. For a more in-depth analysis, we will investigate the domain further with VirusTotal and ANY.RUN.
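Once a URL like the one above has been recovered, the host to look up on a Threat Intelligence platform can be pulled out programmatically. The sketch below uses urllib.parse. The handling of an "mhtml:" prefix and a "!x-usc:" suffix is an assumption about how such relationship targets are commonly wrapped, not something shown in the output above.

```python
from urllib.parse import urlparse


def host_of(target):
    """Return the host part of a relationship target, or None."""
    # Assumed wrapper: some targets are prefixed with the "mhtml:" scheme.
    if target.lower().startswith("mhtml:"):
        target = target[len("mhtml:"):]
    # Assumed wrapper: drop anything after "!" (for example "!x-usc:...").
    target = target.split("!", 1)[0]
    return urlparse(target).hostname


if __name__ == "__main__":
    print(host_of("http://hidusi.com/e8c76295a5f9acb7/side.html"))
```

The returned value works for both domains and bare IP addresses, so the same helper covers the indicators from all four documents.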

The VirusTotal output from the "DETECTION" tab indicates that the domain hidusi.com is classified as malicious by multiple trusted security vendors. Additionally, in the "RELATIONS" tab, this domain is flagged with 15 out of 94 engines detecting it as suspicious. Passive DNS records show it has resolved to multiple IP addresses with a history of detections. The domain is associated with numerous malicious files, including Office Open XML documents and executables with high detection rates (e.g., bpekybwytq.exe at 44/67). Some of these files use names associated with known exploits, indicating the domain is likely used for distributing malware via phishing attacks.

The IoCs from the ANY.RUN analysis indicate that the domain hidusi.com is highly suspicious. The analysis shows that visiting this URL leads to the dropping of multiple files with suspicious SHA256 hashes into temporary and browser data directories, specifically within folders related to an unpacker process. This activity is strongly indicative of a malicious browser extension being installed. The domain also makes numerous requests for images, stylesheets, and other files, which is a common tactic used to mimic a legitimate website. Additionally, there are DNS requests to other domains, suggesting potential communication with other services. The overall behavior points to a compromised session resulting from visiting the domain.

Answer to the question: hidusi.com

Question 5: What is the vulnerability the above files exploited?

To answer this question, we will retrieve the hash value from each file and verify which vulnerability they are related to with VirusTotal.

We will use the following command, which retrieves the MD5 hash value of the specified file:

md5sum filename

The hash value analysis for each file leads to CVE-2021-40444, the MSHTML vulnerability that was found in September 2021.
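For several files at once, the same hashes can be computed with Python's hashlib. This is a small sketch: it also returns SHA-256, on the assumption that VirusTotal lookups are just as convenient with that digest, and it reads in chunks so large files do not need to fit in memory.

```python
import hashlib
import io


def file_digests(stream, chunk_size=65536):
    """Return MD5 and SHA-256 hex digests of a binary stream."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    # Feed both hash objects chunk by chunk until the stream is exhausted.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
        sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}


if __name__ == "__main__":
    # Stand-in data; for a real file use: open(path, "rb")
    print(file_digests(io.BytesIO(b"example bytes")))
```

Passing open(path, "rb") for each of the four challenge files gives values that can be pasted into the VirusTotal search box.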

Answer to the question: cve-2021-40444

Mitigation

Apply the Latest Microsoft Security Update
- Install the Microsoft security update that addresses CVE-2021-40444.
Block ActiveX controls in Office Documents
- Use Group Policy or registry settings to disable ActiveX controls, which were leveraged in this exploit.
Disable document preview in File Explorer
- Turn off the “Show preview handlers in preview pane” setting in File Explorer to prevent automatic rendering of potentially malicious files.
Educate Users About Phishing and Malicious Attachments
- Train users to recognize and avoid suspicious emails, links, and file attachments, especially from unknown sources.

References

Cheat Sheet for Analyzing Malicious Documents | SANS Institute

Microsoft shares temp fix for ongoing Office 365 zero-day attacks

Office Open XML - What is OOXML?

Windows MSHTML zero-day exploits shared on hacking forums

Didier Stevens (blog)

DidierStevensSuite/zipdump.py at master · DidierStevens/DidierStevensSuite · GitHub
